In [2]:
In [3]:
Out[3]:
s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date text hashtags source retweets favorites is_retweet
0 1.340540e+18 Rachel Roh La Crescenta-Montrose, CA Aggregator of Asian American news; scanning di... 08-04-2009 17:52 405 1692 3247 False 20-12-2020 06:06 Same folks said daikon paste could treat a cyt... ['PfizerBioNTech'] Twitter for Android 0 0 False
1 1.338160e+18 Albert Fong San Francisco, CA Marketing dude, tech geek, heavy metal & '80s ... 21-09-2009 15:27 834 666 178 False 13-12-2020 16:27 While the world has been on the wrong side of ... NaN Twitter Web App 1 1 False

Information about the dataset

In [4]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 228207 entries, 0 to 228206
Data columns (total 16 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   s                 228207 non-null  float64
 1   user_name         228205 non-null  object 
 2   user_location     161296 non-null  object 
 3   user_description  211189 non-null  object 
 4   user_created      228207 non-null  object 
 5   user_followers    228207 non-null  int64  
 6   user_friends      228207 non-null  int64  
 7   user_favourites   228207 non-null  int64  
 8   user_verified     228207 non-null  bool   
 9   date              228207 non-null  object 
 10  text              228207 non-null  object 
 11  hashtags          178504 non-null  object 
 12  source            228088 non-null  object 
 13  retweets          228207 non-null  int64  
 14  favorites         228207 non-null  int64  
 15  is_retweet        228207 non-null  bool   
dtypes: bool(2), float64(1), int64(5), object(8)
memory usage: 24.8+ MB
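
The `s` column is stored as `float64`, and the preview shows its values as `1.340540e+18`. A 64-bit float cannot hold every integer of that size, so distinct tweet IDs can collapse into one value. The following is a minimal sketch of the effect, assuming the raw file still contains full integer IDs (it may already have been rounded before loading); the ID and the inline CSV text are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical one-row CSV with a made-up 19-digit ID.
csv_text = "s,user_name\n1340539111971516417,example_user\n"

# Parsing the ID as float64 loses the low-order digits.
as_float = pd.read_csv(io.StringIO(csv_text), dtype={"s": "float64"})
round_trip = int(as_float.loc[0, "s"])

# Parsing the ID as text keeps it intact.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"s": str})
id_preserved = as_text.loc[0, "s"] == "1340539111971516417"

print(round_trip, id_preserved)
```

Reading identifier columns as strings is a common choice because they are labels rather than quantities.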

Number of records and columns in the dataset

In [5]:
Number of Row in the dataset:  228207
Number of column in the dataset:  16

Visualize the null values in the dataset

In [6]:
[Figure: null-value plot; the x-axis has one tick per column, from s through is_retweet]

Count the null values in each column and calculate their percentage

In [7]:
Out[7]:
Total_NA_Value %_NA_Value
0 s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date text hashtags source retweets favorites is_retweet
s 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
user_name 2.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
user_location 66911.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
user_description 17018.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
user_created 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
228202 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
228203 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
228204 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
228205 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
228206 NaN 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

228223 rows × 17 columns
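
The shape of the table above, 228223 rows × 17 columns, equals 228207 + 16 rows and 16 + 1 columns. That is consistent with a per-column count having been concatenated with a per-row frame, so the index labels did not line up. The notebook's own code is not shown, so the following is only a sketch of one way to build a two-column summary on a small made-up frame; the column names `Total_NA_Value` and `%_NA_Value` are taken from the output header.

```python
import pandas as pd

# Small stand-in for the tweet data.
df = pd.DataFrame({
    "user_name": ["a", "b", None, "d"],
    "user_location": [None, "x", None, "y"],
    "retweets": [0, 1, 2, 3],
})

# Both Series are indexed by column name, so they align row for row.
total_na = df.isna().sum()
pct_na = (total_na / len(df) * 100).round(2)

na_table = pd.DataFrame({"Total_NA_Value": total_na, "%_NA_Value": pct_na})
print(na_table)
```

Because both inputs share the same index, the result has one row per column of the original data.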

Drop all rows containing NA values from the dataset

In [8]:
Out[8]:
s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date text hashtags source retweets favorites is_retweet
0 1.340540e+18 Rachel Roh La Crescenta-Montrose, CA Aggregator of Asian American news; scanning di... 08-04-2009 17:52 405 1692 3247 False 20-12-2020 06:06 Same folks said daikon paste could treat a cyt... ['PfizerBioNTech'] Twitter for Android 0 0 False
2 1.337860e+18 eli🇱🇹🇪🇺👌 Your Bed heil, hydra 🖐☺ 25-06-2020 23:30 10 88 155 False 12-12-2020 20:33 #coronavirus #SputnikV #AstraZeneca #PfizerBio... ['coronavirus', 'SputnikV', 'AstraZeneca', 'Pf... Twitter for Android 0 0 False
6 1.337850e+18 Gunther Fehlinger Austria, Ukraine and Kosovo End North Stream 2 now - the pipeline of corru... 10-06-2013 17:49 2731 5001 69344 False 12-12-2020 20:06 it is a bit sad to claim the fame for success ... ['vaccination'] Twitter Web App 0 4 False
9 1.337840e+18 Ch.Amjad Ali Islamabad #ProudPakistani #LovePakArmy #PMIK @insafiansp... 12-11-2012 04:18 671 2368 20469 False 12-12-2020 19:30 #CovidVaccine \n\nStates will start getting #C... ['CovidVaccine', 'COVID19Vaccine', 'US', 'paku... Twitter Web App 0 0 False
10 1.337840e+18 Tamer Yazar Turkey-Israel Im Market Analyst, also Editor... working (fre... 17-09-2009 16:45 1302 78 339 False 12-12-2020 19:29 while deaths are closing in on the 300,000 mar... ['PfizerBioNTech', 'Vaccine'] Twitter Web App 0 0 False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
228202 1.460170e+18 VaxBLR Bengaluru, India Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 09:00 45+ #URBAN #Bengaluru #CovidVaccine Availabili... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False
228203 1.460160e+18 VaxBLR Bengaluru, India Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:30 18-44 #BBMP #Bengaluru #CovidVaccine Availabil... ['BBMP', 'Bengaluru', 'CovidVaccine', 'COVISHI... VaxBlr 0 1 False
228204 1.460160e+18 VaxBLR Bengaluru, India Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:30 18-44 #URBAN #Bengaluru #CovidVaccine Availabi... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False
228205 1.460160e+18 Gatti Valentino🐾 Southern Africa Entrepreneur, self taught cook🍲🌮 @Chelsea @Fer... 28-08-2019 10:31 8103 3113 45726 False 15-11-2021 08:03 They promote their Vaccines leaving out the st... ['SputnikV'] Twitter for Android 0 0 False
228206 1.460160e+18 VaxBLR Bengaluru, India Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:00 45+ #URBAN #Bengaluru #CovidVaccine Availabili... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False

116057 rows × 16 columns

Number of records after dropping the NA values

In [9]:
Out[9]:
116057
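
The row count falls from 228207 to 116057, which is what a default `dropna()` would produce: it removes every row with at least one missing value, and `user_location` and `hashtags` have the most gaps. The sketch below uses made-up data to contrast that default with dropping on a chosen subset of columns. Which columns matter for a given analysis is an assumption left to the reader.

```python
import pandas as pd

df = pd.DataFrame({
    "user_name": ["a", "b", "c", "d"],
    "user_location": ["x", None, "y", "z"],
    "hashtags": ["['A']", "['B']", None, "['C']"],
    "text": ["t1", "t2", "t3", "t4"],
})

# Default: drop a row if any column is missing.
dropped_any = df.dropna()

# Alternative: drop only rows missing the columns the analysis needs.
dropped_subset = df.dropna(subset=["user_location"])

print(len(df), len(dropped_any), len(dropped_subset))
```

The subset form keeps tweets that have no hashtags but do have a location, which may matter for location-based counts.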

Find the number of unique values in each column

In [10]:
Out[10]:
array({'unique_values': [10665, 52302, 21255, 55923, 51835, 18827, 7386, 30366, 2, 89090, 115849, 44911, 262, 391, 880, 1]},
      dtype=object)
In [11]:
{'unique_values': [10665, 52302, 21255, 55923, 51835, 18827, 7386, 30366, 2, 89090, 115849, 44911, 262, 391, 880, 1]}
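
The list above has no column labels, so each number has to be matched to its column by position. A hedged sketch of one way to keep the labels, using `DataFrame.nunique()` on made-up data (the notebook's own code is not shown):

```python
import pandas as pd

df = pd.DataFrame({
    "user_name": ["a", "b", "a", "c"],
    "user_verified": [False, True, False, False],
    "is_retweet": [False, False, False, False],
})

# nunique() returns a Series indexed by column name (NaN is excluded by default).
unique_counts = df.nunique()

# The same numbers in the unlabeled form shown in the output above.
unique_values = {"unique_values": unique_counts.tolist()}

print(unique_counts)
print(unique_values)
```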

Find the most frequent value in each column

In [12]:
Out[12]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Most Frequent item 1455890000000000000.0 VaxBLR Bengaluru, India Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 13-08-2021 16:41 #COVAXIN vaccine approved for children aged 2 ... ['Moderna'] Twitter Web App 0 0 False
frequence 280 6618 7237 6618 6618 1269 7297 8203 101382 17 12 8173 33423 81197 49976 116057
Percent from total 0.241 5.702 6.236 5.702 5.702 1.093 6.287 7.068 87.355 0.015 0.01 7.042 28.799 69.963 43.062 100.0
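
A table like the one above can be sketched with `value_counts()`, which sorts values by descending frequency. The code below is an illustration on made-up data, not the notebook's original cell. The row labels `most_frequent`, `frequency`, and `percent_of_total` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({
    "source": ["Web", "Web", "Android", "Web"],
    "user_verified": [False, False, True, False],
})

rows = {}
for col in df.columns:
    counts = df[col].value_counts()  # most frequent value comes first
    rows[col] = {
        "most_frequent": counts.index[0],
        "frequency": int(counts.iloc[0]),
        "percent_of_total": round(float(counts.iloc[0]) / len(df) * 100, 3),
    }

# One row per original column, labeled by column name.
summary = pd.DataFrame(rows).T
print(summary)
```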

Visualizations

In [13]:

Number and percentage of user names across all countries, with visualization

In [14]:

Number and percentage of user locations across all countries

In [15]:
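
The count-and-percentage summaries in this part of the notebook have no visible code. A minimal sketch of the idea with `value_counts()`, on made-up locations; the output column names `count` and `percent` are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "user_location": [
        "Bengaluru, India", "Bengaluru, India", "Bengaluru, India",
        "Mumbai", "Delhi",
    ]
})

# Absolute counts and the share of each value as a percentage.
counts = df["user_location"].value_counts()
percent = df["user_location"].value_counts(normalize=True).mul(100).round(2)

# Both Series share the same index, so they combine into one table.
top_locations = pd.DataFrame({"count": counts, "percent": percent}).head(10)
print(top_locations)
```

The same pattern applies to `user_name`, `source`, and `user_verified` by changing the column name.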

Number and percentage of tweet sources across all countries

In [16]:

Number and percentage of verified users across all countries

In [17]:

Create a new DataFrame containing only tweets from India

In [18]:
In [19]:
In [20]:
Out[20]:
s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date text hashtags source retweets favorites is_retweet useer_location
12 1.337820e+18 WION india #WION: World Is One | Welcome to India’s first... 21-03-2016 03:44 292510 91 7531 True 12-12-2020 17:45 The agency also released new information for h... NaN TweetDeck 0 18 False india
23 1.337770e+18 BOOM Live mumbai, india IFCN certified fact-driven journalism. India's... 16-03-2014 03:52 64185 1183 1794 True 12-12-2020 14:58 The US Food and Drug Administration (FDA) has ... NaN Twitter Web App 1 5 False mumbai, india
51 1.338630e+18 Dr. Taha Khan india | usa MD/MPH • PGY1 Peds/Child Neurology @theBCRP (@... 30-12-2013 08:51 855 3046 8236 False 14-12-2020 23:48 I’ve never been so excited to get a vaccine 💉💉... ['CovidVaccine', 'PfizerBioNTech', 'VaccinesSa... Twitter for iPhone 1 10 False india | usa
75 1.338570e+18 Prof. Manish Thakur india #Proprietor English Academy #Blockchain #AI #I... 11-06-2012 13:50 3372 1713 119631 False 14-12-2020 20:00 #UgurSahin #ozlemtureci the #Muslim Scientists... ['UgurSahin', 'ozlemtureci', 'Muslim', 'Pfizer... Twitter for Android 0 0 False india
94 1.338550e+18 India Blooms india A news and reference portal on India and a 24X... 10-10-2009 11:19 16816 2448 20 False 14-12-2020 18:27 Toronto to receive Ontario's 1st doses of Pfiz... ['Ontario'] Twitter Web App 0 0 False india
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
228201 1.460180e+18 VaxBLR bengaluru, india Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 09:30 18-44 #URBAN #Bengaluru #CovidVaccine Availabi... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVAXIN'] VaxBlr 0 0 False bengaluru, india
228202 1.460170e+18 VaxBLR bengaluru, india Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 09:00 45+ #URBAN #Bengaluru #CovidVaccine Availabili... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False bengaluru, india
228203 1.460160e+18 VaxBLR bengaluru, india Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:30 18-44 #BBMP #Bengaluru #CovidVaccine Availabil... ['BBMP', 'Bengaluru', 'CovidVaccine', 'COVISHI... VaxBlr 0 1 False bengaluru, india
228204 1.460160e+18 VaxBLR bengaluru, india Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:30 18-44 #URBAN #Bengaluru #CovidVaccine Availabi... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False bengaluru, india
228206 1.460160e+18 VaxBLR bengaluru, india Hourly updates on FREE and PAID 18+ and 45+ va... 21-06-2021 08:44 31 0 0 False 15-11-2021 08:00 45+ #URBAN #Bengaluru #CovidVaccine Availabili... ['URBAN', 'Bengaluru', 'CovidVaccine', 'COVISH... VaxBlr 0 0 False bengaluru, india

48787 rows × 17 columns
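
The filter that produced this subset is not shown. If it was a plain substring match on a lowercased location, values such as "Indiana" would also match, because "indiana" contains "india". The later hashtag list includes tags such as 'Indiana' and 'OurShotHoosiers', which hints at this but does not prove it. The sketch below, on made-up rows, compares a substring match with a word-boundary match and takes an explicit copy of the result.

```python
import pandas as pd

df = pd.DataFrame({
    "user_name": ["WION", "VaxBLR", "Hoosier", "Other"],
    "user_location": ["India", "Bengaluru, India", "Indianapolis, Indiana", None],
})

location = df["user_location"].str.lower()

# Substring match: also catches "indianapolis, indiana".
naive_match = location.str.contains("india", regex=False, na=False)

# Word-boundary match: "india" must stand alone as a word.
is_india = location.str.contains(r"\bindia\b", regex=True, na=False)

# .copy() makes the subset an independent DataFrame, so later column
# assignments do not trigger SettingWithCopyWarning.
india_tweet_data = df.loc[is_india].copy()
india_tweet_data["user_location"] = location[is_india]

print(int(naive_match.sum()), len(india_tweet_data))
```

A word-boundary match still accepts locations such as "india | usa", so it narrows the subset without making it exact.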

Find the number of unique values in each column for India

In [21]:
Out[21]:
array({'unique_values': [6961, 12004, 2091, 12050, 11856, 8515, 3139, 10127, 2, 30880, 48711, 9906, 87, 262, 547, 1, 2091]},
      dtype=object)

Number and percentage of user locations in India

In [22]:

Number and percentage of user names in India

In [23]:

Number and percentage of tweet sources in India

In [24]:

Number and percentage of trending hashtags in India

In [25]:

Number and percentage of verified users in India

In [26]:

Visualize Word Cloud

In [27]:

Most frequent words in tweets from India

In [28]:
Total tweets from India: 48787
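
Word frequencies like those behind a word cloud can be sketched with the standard library alone. The tweets, the stop-word list, and the helper name `tokenize` below are all made up for illustration; the notebook's own preprocessing is not shown.

```python
import re
from collections import Counter

tweets = [
    "#COVAXIN vaccine approved for children",
    "Covaxin vaccine slots available in Bengaluru",
    "Get the vaccine today https://example.com/slots",
]

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "for", "in", "a", "of", "and", "to", "get"}


def tokenize(text):
    """Lowercase the text, strip URLs, and return the non-stop-word tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return [w for w in re.findall(r"[a-z0-9_]+", text) if w not in STOPWORDS]


word_counts = Counter(word for tweet in tweets for word in tokenize(tweet))
top_words = word_counts.most_common(2)
print(top_words)
```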

Most frequent words in tweets from Delhi and Noida

In [66]:
Total tweets from Delhi and Noida: 7066

Hashtag Analysis

In [33]:

Data Cleaning

In [34]:
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\2382753126.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  india_tweet_data['hashtags'] = india_tweet_data['hashtags'].replace(np.nan, '[None]', regex=False)
  india_tweet_data['hashtags'] = india_tweet_data['hashtags'].apply(lambda x: x.replace('\\N', ''))
  india_tweet_data['hashtags_count'] = india_tweet_data['hashtags'].apply(lambda x: len(x.split(',')))
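
The SettingWithCopyWarning above usually means `india_tweet_data` was created by filtering another DataFrame without an explicit copy. The sketch below, on made-up rows, takes a `.copy()` first and then repeats the same kind of column assignments. One deliberate difference is an assumption on my part: rows holding the '[None]' placeholder are counted as zero hashtags, whereas `len(x.split(','))` would count them as one.

```python
import pandas as pd

df = pd.DataFrame({
    "user_location": ["india", "usa", "mumbai, india"],
    "hashtags": ["['CovidVaccine', 'COVAXIN']", "['Moderna']", None],
})

# An explicit copy makes the subset independent of df.
india_tweet_data = df[df["user_location"].str.contains("india")].copy()

# Replace missing hashtags with the placeholder used in the notebook.
india_tweet_data["hashtags"] = india_tweet_data["hashtags"].fillna("[None]")

# Count hashtags per tweet, treating the placeholder as zero.
india_tweet_data["hashtags_count"] = india_tweet_data["hashtags"].apply(
    lambda x: 0 if x == "[None]" else len(x.split(","))
)

print(india_tweet_data[["hashtags", "hashtags_count"]])
```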

Plot the hashtag count per tweet for India

In [35]:
No artists with labels found to put in legend.  Note that artists whose label start with an underscore are ignored when legend() is called with no argument.

Total number and list of all individual hashtags

In [36]:
There are total hashtags from india:7809
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\348078607.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  india_tweet_data['hashtags_individual'] = india_tweet_data['hashtags'].apply(lambda x: x.split(','))
Out[36]:
{" 'Jaipur']",
 " 'zunpulse'",
 " 'happydoctorday']",
 " 'Krishna']",
 "['Texas']",
 " 'trendingpost'",
 " 'worldsbest'",
 " 'Apple'",
 " 'HerCircle']",
 " 'Maharashtra']",
 " 'fridayfitness'",
 " 'NIFTY']",
 " 'free'",
 "['vaccinationday']",
 " 'Diesel']",
 " 'EnemiesFromOut'",
 " 'covishielded'",
 " 'SputnikNews'",
 " 'Azerbaijan']",
 "['Zimbabwe'",
 " 'inspirational'",
 "['Covid19vaccine'",
 "['Lucknow']",
 " 'CCPVirus'",
 " 'BhupendraPatel']",
 " 'GamingSetup'",
 "['covid_19'",
 " 'ThisIsOurShot'",
 " 'Shehzada_Paapu'",
 " 'coronavaccination'",
 " 'ModiHaiTohMumkinHai']",
 "['AtmaNirbharBharat'",
 " 'sinovac']",
 " 'Covid_19'",
 " 'AnuragKashyap']",
 " 'New'",
 "['SINOVAC']",
 " 'valentinesdaygift'",
 " 'APCOVIDHelp']",
 " 'COVIDVaccination'",
 " 'sruthicrs'",
 "['igotmyshot'",
 " 'Gamaleya'",
 " 'Pushpa'",
 " 'COVIDVaccine']",
 "['FriendsTheReunion'",
 " 'Liberandus_KyaKarogeAb']",
 " 'CovidIndiaInfo'",
 " 'Cashback'",
 " 'boostershot'",
 " 'EnemiesWithin']",
 " 'MehulChoksi'",
 " 'EidAlFitr']",
 " 'Fact']",
 "['news'",
 " 'mutations']",
 " 'HT'",
 "['PfizerBooster'",
 " 'StayStrongIndia'",
 " 'covid19vaccine']",
 " 'pfeizer'",
 " 'DakuCovidUpdates']",
 " 'DeltaPlus']",
 " 'CowinApp']",
 " 'UnionHealthMinistry']",
 " 'PfizerLeak'",
 "['share'",
 " 'covidvaccine2021']",
 "['DIWALI']",
 " 'statistics'",
 " 'uspharma']",
 " 'AZD1222'",
 " 'geopolitical']",
 "['BharatBiotechLimited']",
 " 'Cowin'",
 "['wuhanVirus'",
 " 'INDIA'",
 " 'EVENT'",
 " 'podcast']",
 " 'SputnikVinIndia']",
 " 'HarshVardhan']",
 " 'Pharmaceutical'",
 " 'oxygen'",
 "['Bahrain'",
 "['GavinNewsom'",
 " 'IndianCities'",
 "['GorakhpurAirport']",
 "['Bihar'",
 " 'apolloVaccinstion']",
 "['StayHomeStaySafe'",
 " 'CanSino']",
 " 'Centre']",
 "['WhiteSupremacy'",
 " 'WuhanLabLeak'",
 "['CycloneYaas'",
 " 'ModiJi'",
 " 'Corebevax']",
 " 'HappyDiwali'",
 " 'indemnity']",
 "['SII']",
 " 'saralbharatnees'",
 "['Covidvaccine'",
 "['firstdosedone'",
 "['Khurdha']",
 "['healthcare'",
 "['MintPremium'",
 " 'indiavaxine'",
 "['avadi'",
 "['COVIDVaccinationCentres']",
 "['CurlyTales'",
 "['bankdeck'",
 "['chini'",
 "['VaccineLottery'",
 " 'JandJ'",
 " 'effective']",
 "['ENHYPENdiscography'",
 "['Dhar'",
 " 'Song'",
 " 'StockMarket'",
 "['quote'",
 "['EID']",
 "['BMI']",
 " 'tradingstrategy']",
 " 'dhruvrathee'",
 " 'EUA']",
 " 'FoodandDrugAdministration']",
 " 'MrNM56']",
 " 'MarketUpdate']",
 " 'Shower'",
 "['StartupsVaccinationDrive'",
 " 'Punit']",
 "['Cipla'",
 " 'モデルナ']",
 "['AAI'",
 "['Pakistan'",
 "['Covid19India'",
 " 'bjp_govt']",
 "['MadeinIndia'",
 " 'gallbladder'",
 "['Exclusive']",
 " 'reddy']",
 " 'WDTT']",
 " 'Coinbase']",
 "['tax'",
 "['LatestPics'",
 "['antibodies']",
 "['FordMotorCompany'",
 " 'instagramdown'",
 " 'SputnikVaccine'",
 " 'covaxin']",
 "['OneBillion'",
 "['Venezuela'",
 " 'chinesevaccine']",
 " 'ShashiTharoor'",
 " 'unlock'",
 " 'PfizerBioNTechs'",
 " 'hyderabad']",
 "['RahulGandhi']",
 " 'mohw'",
 " 'Icmr'",
 "['CANADA'",
 " 'Delhivaccination']",
 "['Mallareddy'",
 " 'covidvacccine']",
 " 'zycovD']",
 "['Walk_in'",
 "['BirthdayGirlSaada'",
 " 'Nagaland']",
 " 'English'",
 " 'covidchennai'",
 " 'bppoddarhospital']",
 " 'seruminstituteofindia']",
 " '100Crore'",
 " 'fatalities']",
 " 'TheLiveMirror']",
 " 'VaccineResearch']",
 "['Unlocked']",
 "['covaxinated'",
 "['krishnaella'",
 " 'Disability'",
 "['CycloneTauktae'",
 " 'ModiVaccine']",
 " 'Sputinkv']",
 "['schools'",
 " 'Bahrain']",
 "['PIL'",
 " 'Opposition'",
 " 'Madanapalle'",
 " 'G7']",
 "['aatmanirbharbharat'",
 " 'TeamBhopal'",
 " 'States'",
 " 'upsccurrentaffairs']",
 " 'election'",
 " 'Covisheild']",
 "['collector'",
 " 'powai']",
 "['DCGI']",
 "['AWS'",
 "['Israeli'",
 " 'ZeroCovid'",
 " 'surveillance'",
 " 'LiveAdalat']",
 " 'VaccinationGhotala'",
 " 'government'",
 "['TikaUtsav'",
 " 'GDP'",
 " 'Vaccinefor18yearsabove']",
 " 'vaccination'",
 " 'roadlesstraveled'",
 " 'second'",
 " 'jaypeehospital'",
 " 'makeinindia'",
 "['Mexico']",
 "['SEC'",
 " 'Patna'",
 "['VaccineMaitri'",
 " 'PM']",
 "['Headlines']",
 " 'UnitedKingdom']",
 " 'solo'",
 " 'Arrestmetoo']",
 " 'vaccinediplomacy'",
 " 'VaccinesSaveLives']",
 " 'johnsonandjohnsonvacccine']",
 " 'heart'",
 "['Covisheild'",
 " 'COVID19Vic'",
 "['time8news'",
 "['CoronaVaccine'",
 " 'Time'",
 "['Vavo'",
 " 'AgentFirstLook']",
 " 'GoCoronaGo']",
 " 'Bihar'",
 "['Malta'",
 "['PreityZinta'",
 " 'IndiaAgainstPropaganda'",
 "['bigdataanalytics'",
 "['PROUD'",
 " 'mobile']",
 " 'RohitSharma'",
 "['DontHesitateJustVaccinate'",
 "['Critizen']",
 " 'ValimaiUpdate']",
 " 'NASDAQ'",
 " 'spikeprotein'",
 " 'VaccineSeva'",
 " 'Pakistani']",
 "['Lavrov'",
 " 'CovidUK'",
 "['COVIDHIELD']",
 " 'Metoo'",
 " 'California'",
 "['1stdosecovid19vaccine'",
 " 'assamcovidupdate']",
 " 'Fabiflu'",
 "['CovidVaccine']",
 " 'success'",
 " 'MaharashtraGovernment']",
 "['bharatbiotechnews']",
 " 'covidupdate'",
 " 'MRNA'",
 "['encouraging'",
 " 'GodMorningFriday'",
 " 'Gurugram']",
 " 'Ghatkopar'",
 "['NationalDoctorsDay'",
 " 'doomsayers'",
 " 'Delta'",
 "['ocugen'",
 "['Biden'",
 " 'JoeBidenPresidentElect']",
 " 'Covidshield']",
 " 'NarendraModi']",
 " 'Whatsapp']",
 " 'covidshield'",
 " 'CovaxinVaccine'",
 " 'vaccinated'",
 "['DelhiHighCourt'",
 " 'Priority']",
 " 'CCP']",
 "['Indians']",
 " 'OdishaNews'",
 " 'healthcare']",
 " 'covaxinated'",
 "['PositiveVibesOnly']",
 " 'SecondWaveofCovid19'",
 "['Kuwait'",
 " 'Facebook']",
 "['justforfun'",
 "['variants']",
 "['sidhu'",
 " 'norway']",
 "['IndiaWillWin']",
 "['independence'",
 "['COVIDVACCINE']",
 "['GoodNews']",
 "['FDA'",
 " 'DelhiVaccination'",
 " 'RaquelWins'",
 "['Masubramanian'",
 " 'Chemistry']",
 " 'technology']",
 "['Japan']",
 " 'Jab'",
 "['VaccinationDrive']",
 " 'TedrosAdhanom'",
 " 'JDN'",
 " 'Adani5Vaccine'",
 " 'StocksInFocus'",
 "['100CroreVaccination']",
 " 'EY'",
 " 'World']",
 " 'coronafreeindia'",
 " 'vellore'",
 "['NMC'",
 " 'Bharat_Biotech'",
 "['Drreddy'",
 " 'cowserum'",
 " 'Indiana'",
 " 'NortheastTodaymagazine']",
 " 'newsletters'",
 " 'Covavax'",
 " 'PyramidCollege'",
 " 'JohnsonVariant']",
 " 'HealthMinistryofindia']",
 "['LeftRightCentre'",
 "['TeamIndia'",
 " 'SarahGilbert'",
 " 'AajKiBaat']",
 " 'takingmoney']",
 " 'Ludhiana']",
 "['ChartOfTheDay']",
 "['coronawarriors'",
 " 'COVIDBooster'",
 " 'dcgi'",
 " 'Galat'",
 " 'pandemia'",
 "['देश_की_शान_मोदीजी'",
 " 'IndiaNeedsOxygen']",
 "['gurugram'",
 " 'Natanz'",
 "['IPL2021'",
 " '2nddosedone'",
 " 'StaySafe']",
 " 'VaccineForAll'",
 "['greed'",
 " 'USA'",
 " 'Answer'",
 " 'RelianceJio']",
 "['MedicoverHospitals'",
 " 'हिंदी_दिवस'",
 "['ps5'",
 "['IETO'",
 " 'AshokGehlot'",
 " 'SpikeProtein']",
 " 'digital'",
 "['Remdisivir']",
 " 'DNA'",
 " 'GooglePlay'",
 " 'INDIA']",
 " 'DrKariko'",
 " 'dose2'",
 " 'pfizernews']",
 "['GobarStroke'",
 "['bloggbuzz']",
 " 'ShivSena'",
 "['modi'",
 "['research'",
 "['VennilaKabadiKuzhu'",
 " 'IMA_Junior_Doctors_Network'",
 "['Iran'",
 " 'vaccinationdrive'",
 "['emergency']",
 " 'leadership']",
 " 'ModiHaiTohMumkinHai'",
 "['covidshild']",
 " 'GAVI']",
 "['COVID19_Vaccine'",
 "['illiterate'",
 " 'PromotingVaccination'",
 "['States'",
 " 'covidcrisis'",
 "['NASA']",
 "['BengalElections2021']",
 "['Instagram'",
 " 'ZyCoVD']",
 "['BovineBlood'",
 " 'BengaliNewYear'",
 " 'NITIAayog']",
 " 'Justasking']",
 " 'amishdevgan'",
 " 'MDANEESQAMAR']",
 " 'Rural'",
 " 'stockmarkets']",
 " 'VaccineFor18Plus']",
 '[None]',
 "['ChinaExposed'",
 "['johnsonandjohnson']",
 "['Covid'",
 " 'LeadershipMatters']",
 "['MeraBharatMahan'",
 "['VaccineDeaths']",
 " 'foreign'",
 " 'covidboosterdose'",
 "['CDSCO']",
 " 'SmartAsPappu'",
 " 'zero']",
 "['currentaffairs'",
 "['MPFightsCorona'",
 " 'BECOV2A'",
 " 'Zydus'",
 " 'Masks'",
 " 'Jainism']",
 " 'effectiveness']",
 " 'PHC'",
 "['Nostock'",
 "['medicaldarpan']",
 "['UNICEF'",
 " 'Phase3']",
 " 'LetVaccinateIndia']",
 " 'Chineese'",
 " 'asap'",
 "['WhereAreVaccines'",
 " 'Kaala'",
 " 'FirstUp']",
 " 'Covidvaccines']",
 " 'UKVariant'",
 " 'Jhanjharpur'",
 " 'Restricted']",
 " 'WestBengal']",
 " 'RussianVaccine'",
 " 'BigNews']",
 " 'Coronavaccine']",
 " 'BCAunty'",
 " 'PediatricTrials'",
 " 'sarees']",
 "['AIIMS']",
 " 'staystrong']",
 " 'season2'",
 " 'supply'",
 " 'mondaythoughts'",
 "['icmr']",
 " 'Cfie'",
 " 'SRPuram'",
 " '𝐅𝐮𝐬𝐢𝐨𝐧'",
 "['NewsToday'",
 "['vaccinationeducation'",
 " 'Baahubali']",
 "['SinoVac']",
 "['PanaceaBio'",
 "['PRC'",
 " 'aashiqui2'",
 " 'donate']",
 "['BuildIndia'",
 " 'Vaccinatie'",
 " 'RGI'",
 " 'villages']",
 " 'ChellamSir']",
 " 'BJPVACCINE'",
 " 'communists']",
 " 'Europe'",
 " 'vaccinesupply']",
 "['cancelboardexams2021'",
 "['TukdeTukdeGang'",
 "['Finland'",
 "['coketail'",
 " 'VIZAG'",
 "['DeepotsavInAyodhya']",
 " 'BharaBiotech'",
 "['Exclusive'",
 " 'review'",
 "['NarendraModiPM']",
 " 'indiacovid']",
 "['indiavaccine'",
 " 'HaffkineInstitute']",
 " 'zycovd'",
 "['Chinese'",
 " 'odisha']",
 "['SuperSpreaderModi'",
 " 'FastForNation']",
 "['coronavirusindia']",
 " 'डीए_बहाल_करो'",
 "['doctor'",
 " 'Varanasi']",
 " 'maharashtra']",
 " 'Godspeed'",
 " 'NEET2021']",
 "['HarshVardhanShringla']",
 " 'source'",
 " 'covidindia']",
 "['Covid19IndiaHelp']",
 " 'covishild']",
 "['assimilated'",
 " '2DGMedicine']",
 " 'Ronaldo'",
 "['WuhanVirus']",
 " 'Health']",
 "['Covid19Update']",
 "['Nasscom'",
 " 'PicOfTheDay']",
 "['WestBengal']",
 "['PolstratUpdate'",
 " 'SecondDose'",
 " 'Oman']",
 " 'news'",
 " 'biharlockdown']",
 " 'Covidvaccines'",
 " 'CoronaVaccine'",
 "['jalandhar'",
 "['FreePressBulletin'",
 "['maderna'",
 "['global'",
 " 'indiavaccinedrive'",
 " 'Ranchi']",
 " 'bitcoin'",
 "['LetsTalkVaccination'",
 "['Osmani'",
 " 'ZyCov']",
 "['First'",
 "['CovidVacccine'",
 " 'Vaccines']",
 "['ranchi'",
 " 'VaccineManufacturing'",
 "['𝐒𝐩𝐮𝐭𝐧𝐢𝐤𝐕vaccine'",
 "['community'",
 "['crazycovidtime'",
 "['CovidIsNotOver'",
 " 'Shilpamedicare']",
 "['BombayHC']",
 " 'Covidshield'",
 "['Covid19VaccineSputnikV'",
 " 'facebookdown'",
 "['AtmanirbharBharat']",
 "['1Billion'",
 "['Concepcion']",
 " 'PfizerBiontech']",
 " 'vaccinereview'",
 " 'covidvaccines']",
 " 'NIH']",
 "['HIV'",
 " 'Vaccinationdrive']",
 "['BrekingNews'",
 "['100CroreVaccination'",
 " 'BjpDestroyedIndia']",
 "['Punjab']",
 " 'Zydus']",
 "['Ayodhya'",
 " 'emergencyuse']",
 " 'HealthMinistry'",
 "['UKEnding'",
 "['AnshumanMishra']",
 "['JoeBiden'",
 " 'VistaNahiVaccine']",
 " 'OxygenExpress'",
 "['Congressi'",
 "['bhent_india'",
 " 'EUL']",
 " 'OLED'",
 " 'COVID']",
 "['SNGLRTY'",
 "['StandWithTheStudents']",
 " 'Christian'",
 "['RajyaSabha'",
 "['JustTooCute'",
 "['OurTrustedVaccine'",
 "['vaccinediscrimination'",
 " 'HealthCare'",
 "['Valorant'",
 "['COVIDSHIELD'",
 "['Covishield_Vaccine'",
 " 'AIIMSDelhi'",
 " 'covovax'",
 " 'bengaluru'",
 " 'staysafe'",
 " 'StayStrongIndia']",
 " 'OneBillionDoses'",
 " 'covieshield'",
 " 'manishsisodia'",
 "['DrGuleria'",
 "['riyasat']",
 " 'ChristianMedicalCollege']",
 " 'Qualitycontroltest']",
 " 'टीका_उत्सव']",
 "['PharmacyoftheWorld'",
 " 'Visa'",
 " 'ChineseVaccine']",
 " 'WorldClassVax']",
 "['Belarus'",
 " 'YoModiSoBoring'",
 " 'CoronavirusStrain'",
 "['aap'",
 " 'coronathirdwave'",
 " 'patna'",
 " 'RoyalCareHospital']",
 "['DigitalMarketing'",
 " 'vaccinateunder18'",
 " 'AtamNirbharBharat']",
 "['Wednesday']",
 " 'VipulAmrutlalShah'",
 "['Russian']",
 " 'TNI']",
 " 'JohnsonAndJohnsonVaccine']",
 "['Sambalpur'",
 " 'sideeffects'",
 " 'thane']",
 "['Antenatal'",
 " 'VaccinesSaveLives'",
 "['OxygenShortage']",
 "['anandayya'",
 " 'keralaelections']",
 "['Mannkibaat']",
 " 'indianvaccine'",
 " 'seniorcitizen'",
 "['PakistanPM'",
 " 'VaccineMartini']",
 " 'Update']",
 " 'Adv26']",
 "['Kochi'",
 "['Covovax'",
 " 'AmitShah'",
 "['2nddosecovaxin'",
 " 'coronavirus'",
 "['Airlines'",
 " 'INDIAisUNION']",
 " '100daysChallenge'",
 " 'MakeYourBusinessNextLevel'",
 " 'tamilnadu'",
 " 'available'",
 " 'assam'",
 " 'Doctors'",
 " 'mumbai'",
 "['China'",
 " 'Kosovo'",
 " 'Uzbekistan'",
 "['chovidshield'",
 " 'Condition']",
 " 'vaccinememe'",
 "['COVID19Vaccine']",
 " 'FightCorona']",
 " 'Communist'",
 " 'largestVaccinationdrive'",
 " 'gurugramcity']",
 " 'thirdwave'",
 "['Covid19'",
 "['Whitefield'",
 " 'COVID19vaccines'",
 " 'AskWHO'",
 " 'whatsapp'",
 " 'Dolo650'",
 " 'photography'",
 " 'COVID19Vaccination'",
 "['Vaccinated']",
 " 'AnitaBajpai'",
 " 'BharatBiotech'",
 " 'updates'",
 "['1stDose'",
 "['modeling'",
 "['CovidVaccineIndia'",
 " 'quarantine'",
 "['Lilly'",
 " 'oxford']",
 " 'DeltaVariant'",
 " 'IPL'",
 " 'INICET'",
 "['SriLanka']",
 " 'Mumbai'",
 " 'Vaccine']",
 " 'RajyaSabha'",
 "['Italy']",
 "['Covishield']",
 " 'HaryanaLockdown'",
 " 'RandeepGuleria'",
 "['covidAwareness'",
 " 'kumbh'",
 " 'Ventilator'",
 " 'Asia'",
 "['TrendingTonight']",
 " 'RangaBilla']",
 "['covidvaccine'",
 "['Results'",
 "['proudIndian']",
 "['PrimeMinister']",
 " 'Mars']",
 " 'COVAXIN']",
 " 'Manipal']",
 " 'PMModitakenvaccine']",
 " 'modernanews'",
 "['PfizerVaccines'",
 "['WatchVideo'",
 "['CoronavirusIndia']",
 " 'CovaxinEUL']",
 " 'Bangladesh'",
 "['vaccine']",
 " 'COVIDEmergency'",
 "['IndianJournal'",
 "['katikariko'",
 " 'CovidHelp']",
 " 'latest_news']",
 " 'chinesevirus']",
 "['Nagpur'",
 " 'DrRim'",
 "['BJPFails'",
 " 'TodayInHistory'",
 "['NewsBreak'",
 " 'cowinregistration'",
 "['WestBengal'",
 "['Doctor'",
 " 'Communist']",
 " 'leavenoonebehind'",
 " 'nashik'",
 " 'Population'",
 "['Gudipadwa'",
 "['Gujarat']",
 "['VaccineFor18Plus']",
 " 'COVID_19'",
 " 'novavax'",
 " 'NXTakeOver'",
 " 'CovidVaccines'",
 " 'Election2021']",
 "['reporter']",
 " 'Haryana']",
 " 'DOSE1']",
 " 'utsav'",
 "['eul'",
 "['Haffkine'",
 "['singapore'",
 " 'COVIDー19']",
 " 'marketplaces']",
 " 'Sensex'",
 " 'tu84'",
 "['meditation']",
 "['AstraZenaca'",
 " 'EmiratesGroup']",
 " 'tirunelveli'",
 " 'DCGA'",
 " 'Drupal']",
 "['UttarPradeshgovernment']",
 "['coimbatore'",
 " 'happydiwali2021'",
 " 'TedrosAdhanom']",
 "['DHD'",
 " 'DrReddysLaboratories']",
 " 'Bbmp']",
 "['EUA']",
 " 'whoapproval']",
 " 'Tips']",
 " 'infection']",
 " 'makemytrip']",
 " 'EmergencyUseAuthorization'",
 " 'WuhanVirus'",
 " 'stocksinnews'",
 "['january']",
 " 'CoviVac']",
 "['MamataBanerjee'",
 " 'tablemountainfire']",
 " 'KonkonaSenSharma'",
 " 'travellers'",
 "['UniversalVaccine']",
 " 'handpainted'",
 " 'KolkataCovid'",
 " 'DelhiNCR'",
 "['SL'",
 "['PMNarendraModi'",
 " 'instagram']",
 " 'T20']",
 " 'Facebook'",
 " 'ᴠᴀᴄᴄɪɴᴇᴅᴀʏ']",
 " 'IMA_JDN'",
 "['Godrej'",
 " 'Pfeizergate'",
 " 'TheBillionVaccine']",
 " 'shot']",
 "['chinesevaccine'",
 "['Pfizergate'",
 "['HighCourt']",
 " 'CovidVaccination'",
 "['PFIZER'",
 " 'stelisbiopharma']",
 " 'PoliticsToday']",
 " 'Corona'",
 " 'pricecap']",
 "['IndiaDevelopmentDebate'",
 " 'Ayesha'",
 " 'businesstips']",
 " 'BanPMK'",
 " 'PMSpeech'",
 "['FDA']",
 " '2ndDose'",
 " 'MOMO'",
 " 'Walkin'",
 " 'ripvivek'",
 " 'Turkey']",
 " 'biologicalenews'",
 " 'Sputnikvaccine']",
 " 'ZyCoVD'",
 " 'P1'",
 " 'BRAZIL'",
 "['Biotechnology'",
 " 'BJP'",
 "['vaccinationforall'",
 " 'Bramhapur']",
 " 'krishnaella']",
 "['Covid']",
 " 'markofthebeast'",
 " 'StayHappy']",
 "['maheresaab'",
 " 'Exempt'",
 "['NationWithPMModi'",
 "['Centre'",
 "['JanataCurfew'",
 " 'IndiaGetsVaccinated']",
 " 'mrna'",
 "['PUNE'",
 "['Thought4DDay']",
 "['5iveLive'",
 " 'R']",
 "['govt']",
 " 'PoonawallaScam']",
 " 'SriLanka'",
 "['Baghlan'",
 " 'smartwatch'",
 " 'TCS'",
 " 'BusinessOwner'",
 "['reports'",
 " 'stock'",
 "['Educationisnottourism'",
 " 'Business'",
 " 'TV9News'",
 " 'lockdowndelhi']",
 "['JohnsonJohnson'",
 " 'islam'",
 " 'VladimirPutin'",
 " 'Maharajganj']",
 "['waterwars'",
 "['bharatbiotech'",
 "['Ramanthapur'",
 " 'Affection'",
 " 'Cotedivoire'",
 " 'Monsoon2021'",
 "['fridaymorning'",
 " 'Harvard']",
 " 'VaccineShortage'",
 "['covaxin']",
 " 'BigShot']",
 " 'EUDRAGDMP']",
 " 'Hindustan'",
 " 'GreenLight'",
 " 'prevention'",
 "['Thanks'",
 " 'Qatar']",
 " 'Lockdown']",
 " 'WPI'",
 "['myths'",
 "['NewIndia'",
 " 'JairBolsonaro']",
 " 'POLL'",
 " 'maskdown'",
 "['Gravitas'",
 "['SerumInstituteofIndia'",
 " 'Doctor'",
 " 'Google'",
 " 'PfizerCovidVaccine']",
 "['coronavaccination'",
 " 'TechnoSupport'",
 " 'WednesdayThought']",
 " 'Morning'",
 "['Pakistan']",
 " 'OurShotHoosiers'",
 " 'Sindh'",
 " 'SputnikVvaccine']",
 "['WearAMask']",
 " 'banknifty'",
 " 'profits'",
 "['actress'",
 "['bsyediyurappa']",
 " 'OnlineExams'",
 "['ThaneMayor'",
 " 'BharatTech']",
 " 'RDIF'",
 " 'postgazetteu']",
 "['CCP'",
 "['ThankYouModiJi'",
 " 'Ramdev'",
 " 'shraddhakapoor']",
 "['Chattisgarh'",
 " 'currency'",
 "['ThePharmacyoftheWorld'",
 "['drdavidnabarro'",
 " 'DELTAPLUS']",
 "['WorldsLargestVaccinationDrive'",
 "['COVIDEmergency'",
 "['amulet'",
 "['worldhealthorganisation'",
 " 'nse'",
 "['ChinaGlobalThreat'",
 " 'DNAvaccine'",
 " 'CongratulationsIndia'",
 "['Denmark'",
 " 'pregnantwomen']",
 "['goodmorning'",
 " 'MadeInIndia'",
 "['aiims']",
 " 'bjd'",
 "['Ivermectin'",
 "['Indian']",
 "['clinicaltrial'",
 " 'CovidVax'",
 "['isupportBharatBiotech']",
 "['pinarayivijayan']",
 " 'mRNA1273'",
 " '5iveLive']",
 " 'Coimbatore']",
 "['Mucormycosis'",
 " 'GovernmentOfIndia']",
 " 'Puducherry'",
 " 'covidnews'",
 " 'SecondWave'",
 "['wuhanvirus'",
 "['IIL'",
 " 'jamshedpur']",
 " 'kidney']",
 " 'odisha'",
 " 'Markets'",
 " 'Sanjeevani']",
 " 'Mauritius'",
 " 'ModiJi']",
 "['announce']",
 " 'covid19vacccine'",
 " 'SriLanka']",
 " 'fightcovid19'",
 " 'DelhiCoronaUpdate'",
 " 'CoronavirusVaccine'",
 " 'covidtoday'",
 " 'kuwait'",
 " 'SC'",
 "['ModiMaya'",
 " 'SouthAfricancovid']",
 "['Liver']",
 " 'beautiful'",
 " 'Corona']",
 " 'IsraelPalestine']",
 " 'PipiliLovesBJD']",
 " 'vaccinateeveryindian']",
 " 'CoVaxin'",
 "['JNJ'",
 " 'finalyear'",
 " 'MachineLearning']",
 "['BanegaSwasthIndia'",
 " 'ModernaVaccine'",
 "['india']",
 " 'vaccineindia'",
 " 'phc'",
 " 'KrifyVaccination'",
 " 'efficacy'",
 " 'AroundFishers']",
 "['NamakkalDistrict'",
 "['COVIDEmergencyIndia'",
 " 'businesses'",
 " 'virus']",
 " 'COVID19SL'",
 " 'MukhtarAnsari']",
 " 'Familyman2'",
 " 'NEET_PG']",
 "['MannKiBaat'",
 " 'COVID_19']",
 " 'kashmir'",
 " 'OppositionLeaders']",
 " 'MansukhMandaviya']",
 " 'Modi']",
 " 'kids']",
 " 'SputnikV']",
 "['BBMP'",
 "['RTPCR']",
 " 'mask'",
 " 'researcher'",
 " '1st']",
 " 'tamilnadu']",
 "['April'",
 "['ParliamentQuestion'",
 "['Chennai'",
 " 'Brazil'",
 " 'personality']",
 "['ModiGovt'",
 "['SputnikBreaking'",
 " 'MadhuriDixit'",
 " 'cmkejriwal']",
 " 'safeindia'",
 "['covidvaccine']",
 " 'Cadila'",
 " 'AI'",
 "['Booster'",
 "['BigBreakingnewsinIndia'",
 "['FactsVsMyths']",
 "['covidcases'",
 " 'DailyUpdate'",
 "['VaccineUpdates'",
 " 'Nation']",
 "['Antimicrobial'",
 ...}
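
The entries in the set above still carry list punctuation (for example `"['Covid']"` and `" 'P1'"`), which suggests the `hashtags` strings were split on commas rather than parsed as lists. A minimal sketch of one alternative follows, assuming each non-null cell holds the string form of a Python list as in the `hashtags` column shown earlier. `parse_hashtags` and the toy frame are hypothetical and not part of the notebook.

```python
import ast

import pandas as pd


def parse_hashtags(cell):
    """Turn a cell such as "['Covid', 'P1']" into ['Covid', 'P1']; NaN becomes []."""
    if not isinstance(cell, str):
        return []  # missing values arrive as float NaN
    try:
        tags = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []  # not a valid list literal
    return [str(t) for t in tags] if isinstance(tags, list) else []


# Toy frame standing in for the tweet data (values are illustrative only).
demo = pd.DataFrame({"hashtags": ["['Covid']", "['PUNE', 'mrna']", float("nan")]})
demo["hashtag_list"] = demo["hashtags"].apply(parse_hashtags)

# explode() gives one row per hashtag, so the unique tags come out clean.
unique_tags = set(demo["hashtag_list"].explode().dropna())
print(unique_tags)
```

Parsing with `ast.literal_eval` keeps each tag free of brackets and quotes, so `'SriLanka'` and `'SriLanka']` would no longer be counted as different hashtags.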

Create a new datetime column from the date column

In [37]:
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\1436535826.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  india_tweet_data['datedt'] = pd.to_datetime(india_tweet_data['date'])
Out[37]:
12       2020-12-12 17:45:00
23       2020-12-12 14:58:00
51       2020-12-14 23:48:00
75       2020-12-14 20:00:00
94       2020-12-14 18:27:00
                 ...        
228201   2021-11-15 09:30:00
228202   2021-11-15 09:00:00
228203   2021-11-15 08:30:00
228204   2021-11-15 08:30:00
228206   2021-11-15 08:00:00
Name: datedt, Length: 48787, dtype: datetime64[ns]
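
The `SettingWithCopyWarning` above is typically raised when a column is assigned on a frame that was itself obtained by filtering another frame. A minimal sketch of one way to avoid it follows, together with an explicit day-first format for strings such as `12-12-2020 17:45`. How `india_tweet_data` was actually built is not shown in this section, so the location filter, the toy data, and the name `india_demo` are assumptions.

```python
import pandas as pd

# Toy stand-in for the full tweet frame (values are illustrative only).
tweets = pd.DataFrame({
    "user_location": ["india", "San Francisco, CA", "mumbai, india"],
    "date": ["12-12-2020 17:45", "13-12-2020 16:27", "01-02-2021 14:58"],
})

# .copy() makes the subset an independent frame, so later column
# assignments do not trigger SettingWithCopyWarning.
mask = tweets["user_location"].str.contains("india", case=False, na=False)
india_demo = tweets[mask].copy()

# An explicit day-first format avoids ambiguous dates such as 01-02-2021
# being read as 2 January instead of 1 February.
india_demo["datedt"] = pd.to_datetime(india_demo["date"], format="%d-%m-%Y %H:%M")
print(india_demo["datedt"])
```

Stating the format also makes parsing fail loudly on a malformed string instead of silently guessing.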

Extract year, month, day, dayofweek, hour, minute, dayofyear and date_only, and store each in a new column of the same dataset

In [38]:
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  india_tweet_data['year'] = india_tweet_data['datedt'].dt.year
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:2: SettingWithCopyWarning: 
  india_tweet_data['month'] = india_tweet_data['datedt'].dt.month
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:3: SettingWithCopyWarning: 
  india_tweet_data['day'] = india_tweet_data['datedt'].dt.day
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:4: SettingWithCopyWarning: 
  india_tweet_data['dayofweek'] = india_tweet_data['datedt'].dt.dayofweek
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:5: SettingWithCopyWarning: 
  india_tweet_data['hour'] = india_tweet_data['datedt'].dt.hour
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:6: SettingWithCopyWarning: 
  india_tweet_data['minute'] = india_tweet_data['datedt'].dt.minute
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:7: SettingWithCopyWarning: 
  india_tweet_data['dayofyear'] = india_tweet_data['datedt'].dt.dayofyear
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3116021394.py:8: SettingWithCopyWarning: 
  india_tweet_data['date_only'] = india_tweet_data['datedt'].dt.date
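
The eight assignments in this cell can also be written as a single `assign` call, which returns a new frame instead of writing into a possible slice. This is a sketch on a toy frame, not the notebook's own code; the column names match the ones created above.

```python
import pandas as pd

# Two timestamps taken from the output above.
demo = pd.DataFrame({"datedt": pd.to_datetime(["2020-12-12 17:45", "2020-12-14 23:48"])})

dt = demo["datedt"].dt  # accessor for datetime components
demo = demo.assign(
    year=dt.year,
    month=dt.month,
    day=dt.day,
    dayofweek=dt.dayofweek,  # Monday=0 ... Sunday=6
    hour=dt.hour,
    minute=dt.minute,
    dayofyear=dt.dayofyear,
    date_only=dt.date,       # plain datetime.date objects
)
print(demo.iloc[0].to_dict())
```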

First Three Records

In [39]:
Out[39]:
s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date ... hashtags_individual datedt year month day dayofweek hour minute dayofyear date_only
12 1.337820e+18 WION india #WION: World Is One | Welcome to India’s first... 21-03-2016 03:44 292510 91 7531 True 12-12-2020 17:45 ... [[None]] 2020-12-12 17:45:00 2020 12 12 5 17 45 347 2020-12-12
23 1.337770e+18 BOOM Live mumbai, india IFCN certified fact-driven journalism. India's... 16-03-2014 03:52 64185 1183 1794 True 12-12-2020 14:58 ... [[None]] 2020-12-12 14:58:00 2020 12 12 5 14 58 347 2020-12-12
51 1.338630e+18 Dr. Taha Khan india | usa MD/MPH • PGY1 Peds/Child Neurology @theBCRP (@... 30-12-2013 08:51 855 3046 8236 False 14-12-2020 23:48 ... [['CovidVaccine', 'PfizerBioNTech', 'Vaccine... 2020-12-14 23:48:00 2020 12 14 0 23 48 349 2020-12-14

3 rows × 28 columns

Number of tweets per date

In [40]:
Out[40]:
date_only count
0 2020-12-12 4
1 2020-12-13 12
2 2020-12-14 9
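
A table of this shape can be produced with a `groupby` on `date_only`. The sketch below runs on a toy frame and only assumes the column names `date_only`, `text` and `count`; the notebook's actual aggregation code is not shown.

```python
import datetime

import pandas as pd

# Toy frame: two tweets on one day, three on the next (illustrative only).
demo = pd.DataFrame({
    "date_only": [datetime.date(2020, 12, 12)] * 2 + [datetime.date(2020, 12, 13)] * 3,
    "text": list("abcde"),
})

# Count rows per date and expose the result as a two-column frame.
agg = (
    demo.groupby("date_only")["text"]
    .count()
    .reset_index(name="count")
)
print(agg)
```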

Plot a line graph of the number of tweets per day of year in India

In [41]:
In [42]:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[42], line 1
----> 1 plot_time_variation_graph(india_tweet_agg_data, title='Number of tweet per day of year in india',size=3)

Cell In[41], line 3, in plot_time_variation_graph(df, x, y, hue, size, title, is_log)
      1 def plot_time_variation_graph(df, x='data_only',y='count', hue=None, size=1, title='', is_log=False):
      2     f,ax = plt.subplots(1,1,figsize=(6*size,3*size))
----> 3     g = sns.lineplot(x=x,y=y, hue=hue, data=df)
      4     plt.xticks(rotation=90)
      5     if hue:

File C:\ProgramData\anaconda3\Lib\site-packages\seaborn\relational.py:618, in lineplot(data, x, y, hue, size, style, units, palette, hue_order, hue_norm, sizes, size_order, size_norm, dashes, markers, style_order, estimator, errorbar, n_boot, seed, orient, sort, err_style, err_kws, legend, ci, ax, **kwargs)
    615 errorbar = _deprecate_ci(errorbar, ci)
    617 variables = _LinePlotter.get_semantics(locals())
--> 618 p = _LinePlotter(
    619     data=data, variables=variables,
    620     estimator=estimator, n_boot=n_boot, seed=seed, errorbar=errorbar,
    621     sort=sort, orient=orient, err_style=err_style, err_kws=err_kws,
    622     legend=legend,
    623 )
    625 p.map_hue(palette=palette, order=hue_order, norm=hue_norm)
    626 p.map_size(sizes=sizes, order=size_order, norm=size_norm)

File C:\ProgramData\anaconda3\Lib\site-packages\seaborn\relational.py:365, in _LinePlotter.__init__(self, data, variables, estimator, n_boot, seed, errorbar, sort, orient, err_style, err_kws, legend)
    351 def __init__(
    352     self, *,
    353     data=None, variables={},
   (...)
    359     # the kind of plot to draw, but for the time being we need to set
    360     # this information so the SizeMapping can use it
    361     self._default_size_range = (
    362         np.r_[.5, 2] * mpl.rcParams["lines.linewidth"]
    363     )
--> 365     super().__init__(data=data, variables=variables)
    367     self.estimator = estimator
    368     self.errorbar = errorbar

File C:\ProgramData\anaconda3\Lib\site-packages\seaborn\_oldcore.py:640, in VectorPlotter.__init__(self, data, variables)
    635 # var_ordered is relevant only for categorical axis variables, and may
    636 # be better handled by an internal axis information object that tracks
    637 # such information and is set up by the scale_* methods. The analogous
    638 # information for numeric axes would be information about log scales.
    639 self._var_ordered = {"x": False, "y": False}  # alt., used DefaultDict
--> 640 self.assign_variables(data, variables)
    642 for var, cls in self._semantic_mappings.items():
    643 
    644     # Create the mapping function
    645     map_func = partial(cls.map, plotter=self)

File C:\ProgramData\anaconda3\Lib\site-packages\seaborn\_oldcore.py:701, in VectorPlotter.assign_variables(self, data, variables)
    699 else:
    700     self.input_format = "long"
--> 701     plot_data, variables = self._assign_variables_longform(
    702         data, **variables,
    703     )
    705 self.plot_data = plot_data
    706 self.variables = variables

File C:\ProgramData\anaconda3\Lib\site-packages\seaborn\_oldcore.py:938, in VectorPlotter._assign_variables_longform(self, data, **kwargs)
    933 elif isinstance(val, (str, bytes)):
    934 
    935     # This looks like a column name but we don't know what it means!
    937     err = f"Could not interpret value `{val}` for parameter `{key}`"
--> 938     raise ValueError(err)
    940 else:
    941 
    942     # Otherwise, assume the value is itself data
    943 
    944     # Raise when data object is present and a vector can't matched
    945     if isinstance(data, pd.DataFrame) and not isinstance(val, pd.Series):

ValueError: Could not interpret value `data_only` for parameter `x`

In [43]:

Number and percentage of tweets per day of week in India

In [44]:

Number and percentage of tweets per date in India

In [45]:

Number and percentage of tweets per hour in India

In [46]:

Number and percentage of tweets per minute in India

In [47]:
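
The four breakdowns above (day of week, date, hour, minute) share one computation: count the rows per value and divide by the total. The outputs of these cells are not preserved here, so the following is only a sketch of that computation on a toy frame; `count_and_percentage` is a hypothetical helper.

```python
import pandas as pd


def count_and_percentage(df, column):
    """Return one row per distinct value with its count and share in percent."""
    counts = df[column].value_counts().sort_index()
    out = counts.rename_axis(column).reset_index(name="count")
    out["percentage"] = (100 * out["count"] / out["count"].sum()).round(2)
    return out


# Toy data: day-of-week codes for six tweets (illustrative only).
demo = pd.DataFrame({"dayofweek": [5, 5, 0, 0, 0, 3]})
summary = count_and_percentage(demo, "dayofweek")
print(summary)
```

The same helper can be called with `"date_only"`, `"hour"` or `"minute"` as the column name.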

Apply the VADER sentiment intensity analyzer

In [48]:
Positive
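
The notebook's `sentiment_analysis` function is not shown. The sketch below illustrates one common way to turn VADER's `compound` score into the three labels used later, with the conventional cut-offs of 0.05 and -0.05. The two-argument signature, the threshold parameter, and the stand-in analyzer class are assumptions made so the example runs without the package.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


def sentiment_analysis(text, analyzer):
    """Score text with any object exposing polarity_scores(text) -> dict."""
    scores = analyzer.polarity_scores(str(text))
    return label_from_compound(scores["compound"])


class FixedScoreAnalyzer:
    """Stand-in for a real analyzer; always reports the same compound score."""

    def __init__(self, compound):
        self.compound = compound

    def polarity_scores(self, text):
        return {"compound": self.compound}


print(sentiment_analysis("any text", FixedScoreAnalyzer(0.6)))
```

With the package installed, an instance of `SentimentIntensityAnalyzer` from `vaderSentiment.vaderSentiment` can be passed as `analyzer` in place of the stand-in.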
In [49]:
C:\Users\Jay Pal\AppData\Local\Temp\ipykernel_17500\3765676.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  india_tweet_data['sentiment'] = india_tweet_data['text'].apply(lambda x: sentiment_analysis(x))
In [50]:
Out[50]:
s user_name user_location user_description user_created user_followers user_friends user_favourites user_verified date ... datedt year month day dayofweek hour minute dayofyear date_only sentiment
12 1.337820e+18 WION india #WION: World Is One | Welcome to India’s first... 21-03-2016 03:44 292510 91 7531 True 12-12-2020 17:45 ... 2020-12-12 17:45:00 2020 12 12 5 17 45 347 2020-12-12 Positive
23 1.337770e+18 BOOM Live mumbai, india IFCN certified fact-driven journalism. India's... 16-03-2014 03:52 64185 1183 1794 True 12-12-2020 14:58 ... 2020-12-12 14:58:00 2020 12 12 5 14 58 347 2020-12-12 Negative
51 1.338630e+18 Dr. Taha Khan india | usa MD/MPH • PGY1 Peds/Child Neurology @theBCRP (@... 30-12-2013 08:51 855 3046 8236 False 14-12-2020 23:48 ... 2020-12-14 23:48:00 2020 12 14 0 23 48 349 2020-12-14 Positive

3 rows × 29 columns

Visualize the count of each sentiment

In [51]:

Visualize the word cloud of positive-sentiment tweets

In [68]:
Total tweets Positive text: 48787

Visualize the word cloud of negative-sentiment tweets

In [69]:
Total tweets negative text: 48787

Visualize the word cloud of neutral-sentiment tweets

In [70]:
Total tweets Netual text: 48787
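
All three cells report 48787, which is the length of the whole India subset, so the totals were probably taken without filtering on the `sentiment` column. A minimal sketch of per-label filtering follows, assuming the labels `Positive`, `Negative` and `Neutral`; `text_for_sentiment` and the toy frame are hypothetical.

```python
import pandas as pd


def text_for_sentiment(df, label):
    """Return (number of tweets, joined text) for one sentiment label."""
    subset = df.loc[df["sentiment"] == label, "text"].dropna()
    return len(subset), " ".join(subset.astype(str))


# Toy frame standing in for india_tweet_data (values are illustrative only).
demo = pd.DataFrame({
    "text": ["great news", "bad side effects", "dose two today", "so happy"],
    "sentiment": ["Positive", "Negative", "Neutral", "Positive"],
})

for label in ["Positive", "Negative", "Neutral"]:
    n, joined = text_for_sentiment(demo, label)
    # `joined` is the string a word cloud generator would consume.
    print(f"Total tweets {label} text: {n}")
```

With the filter in place, the three totals should add up to the size of the subset rather than each repeating it.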

Thank you!